There are many beautiful formulas for $\pi$ (see for example (H)). The purpose
of this note is to introduce an alternate derivation of Wallis's product formula,
equation (1), which could be covered in a first course on probability, statistics,
or number theory. We quickly review other famous formulas for $\pi$, recall some
needed facts from probability, and then derive Wallis's formula. We conclude by
combining some of the other famous formulas with Wallis's formula to derive an
interesting expression for $\log(\pi/2)$ (equation (5)).

Often in a first-year calculus course students encounter the Gregory-Leibniz
formula,

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}.$$

The proof uses the fact that the derivative of $\arctan x$ is $1/(1 + x^2)$, so
$\pi/4 = \int_0^1 dx/(1 + x^2)$. To complete the proof, expand the integrand with
the geometric series formula and then justify interchanging the order of
integration and summation.
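As a quick numerical illustration of the Gregory-Leibniz series, the sketch below sums its first terms. The function name `leibniz_partial_sum` is hypothetical and chosen only for this example; the alternating series bound says the error after $N$ terms is at most $4/(2N+1)$, so convergence is slow.

```python
import math


def leibniz_partial_sum(terms: int) -> float:
    """Return 4 * sum_{n=0}^{terms-1} (-1)^n / (2n + 1), an approximation to pi."""
    total = 0.0
    for n in range(terms):
        total += (-1) ** n / (2 * n + 1)
    return 4.0 * total


if __name__ == "__main__":
    for terms in (10, 1_000, 100_000):
        approx = leibniz_partial_sum(terms)
        # The alternating series bound gives |error| <= 4 / (2 * terms + 1).
        print(terms, approx, abs(approx - math.pi))
```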

Another interesting formula involves Bernoulli numbers and the Riemann zeta
function. The Bernoulli numbers $B_k$ are the coefficients in the Taylor series

$$\frac{t}{e^t - 1} = 1 - \frac{t}{2} + \sum_{k=2}^{\infty} \frac{B_k t^k}{k!};$$

each $B_k$ is rational. The Riemann zeta function is
$\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$, which converges for real part of $s$
greater than 1. Using complex analysis one finds (see for instance [10, p. 365]
or [18, pp. 179-180]) that

$$\zeta(2k) = -\frac{(2\pi i)^{2k} B_{2k}}{2\,(2k)!},$$

yielding formulas for $\pi$ to any even power.¹ In particular,
$\pi^2/6 = \sum_n n^{-2}$ and $\pi^4/90 = \sum_n n^{-4}$.
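The link between Bernoulli numbers and even zeta values can be checked numerically. The sketch below is only an illustration: it computes $B_k$ exactly from the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ (an assumption of this example, equivalent to the Taylor series definition with $B_1 = -1/2$), evaluates $\zeta(2k) = (-1)^{k+1} (2\pi)^{2k} B_{2k} / (2\,(2k)!)$, and compares with a plain partial sum. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, pi


def bernoulli_numbers(count):
    """Return [B_0, ..., B_{count-1}] for t/(e^t - 1) = sum B_k t^k / k!."""
    numbers = [Fraction(1)]
    for m in range(1, count):
        # Recurrence: sum_{j=0}^{m} C(m+1, j) * B_j = 0, solved for B_m.
        partial = sum(comb(m + 1, j) * numbers[j] for j in range(m))
        numbers.append(-partial / (m + 1))
    return numbers


def zeta_even(k):
    """zeta(2k) from the closed form in terms of B_{2k}."""
    b_2k = bernoulli_numbers(2 * k + 1)[2 * k]
    return (-1) ** (k + 1) * (2 * pi) ** (2 * k) * float(b_2k) / (2 * factorial(2 * k))


def zeta_partial(s, terms=200_000):
    """Partial sum of the defining series of zeta(s)."""
    return sum(n ** -s for n in range(1, terms + 1))


if __name__ == "__main__":
    print(bernoulli_numbers(5))
    print(zeta_even(1), pi ** 2 / 6, zeta_partial(2))
    print(zeta_even(2), pi ** 4 / 90, zeta_partial(4))
```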
¹ An amusing consequence of these formulas is a proof of the infinitude of primes. Using unique
factorization, one can show that $\zeta(s)$ also equals $\prod_p (1 - p^{-s})^{-1}$, where $p$ runs over all primes. As
$\pi^2$ is irrational and $\zeta(2) = \pi^2/6$, there must be infinitely many primes: if there were only finitely
many then $\pi^2/6 = \prod_p (1 - p^{-2})^{-1}$ would be rational! See [13] for explicit lower bounds on $\pi(x)$
derivable from upper bounds for the irrationality measure of $\zeta(2)$, and [14] for more details on the
numerous connections between $\zeta(s)$ and number theory.
One of the most interesting formulas for $\pi$ is a multiplicative one due to Wallis
(1665):

$$\frac{\pi}{2} = \frac{2 \cdot 2}{1 \cdot 3} \cdot \frac{4 \cdot 4}{3 \cdot 5} \cdot \frac{6 \cdot 6}{5 \cdot 7} \cdots = \prod_{n=1}^{\infty} \frac{2n \cdot 2n}{(2n-1)(2n+1)}. \tag{1}$$

Common proofs use the infinite product expansion for $\sin x$ (see [18, p. 142]) or
induction to prove formulas for integrals of powers of $\sin x$ (see [3, p. 115]). We
present a mostly elementary proof using standard facts about probability
distributions encountered in a first course on probability or statistics (and
hence the title).² The reason we must write "mostly elementary" is that at one
point we appeal to the Dominated Convergence Theorem. It is possible to bypass
this and argue directly, and we sketch the main ideas for the interested reader.

Recall that a continuous function $f(x)$ is a continuous probability distribution if
(1) $f(x) \ge 0$ and (2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$. We immediately see that if $g(x)$ is a
nonnegative continuous function whose integral is finite and nonzero then there
exists an $a > 0$ such that $a\,g(x)$ is a continuous probability distribution (take
$a = 1/\int_{-\infty}^{\infty} g(x)\,dx$). This simple observation is a key ingredient in our proof,
and is an extremely important technique in mathematics; the proof of Wallis's
formula is but one of many applications.³ In fact, this observation greatly
simplifies numerous calculations in random
matrix theory, which has successfully modeled diverse systems ranging from en- 
ergy levels of heavy nuclei to the prime numbers; see for introductions to 
random matrix theory and lITTi for applications of this technique to the subject. 
One of the purposes of this paper is to introduce students to the consequences of 
this simple observation. 
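The normalization observation can be tried on a concrete function. In the sketch below the choice $g(x) = (1 + x^2)e^{-x^2}$, the truncation of the real line to $[-40, 40]$, the midpoint rule, and all function names are assumptions made for illustration. For this $g$ the integral is $\tfrac{3}{2}\sqrt{\pi}$, so the constant should be close to $a = 2/(3\sqrt{\pi})$.

```python
import math


def integrate(func, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


def normalize(g, lo=-40.0, hi=40.0):
    """Return (a, f) where a = 1 / integral(g) and f = a * g integrates to 1."""
    a = 1.0 / integrate(g, lo, hi)
    return a, (lambda x: a * g(x))


def g(x):
    """A nonnegative continuous function with finite, nonzero integral."""
    return (1.0 + x * x) * math.exp(-x * x)


a, f = normalize(g)

if __name__ == "__main__":
    print("a =", a, "expected", 2.0 / (3.0 * math.sqrt(math.pi)))
    print("integral of a*g =", integrate(f, -40.0, 40.0))
```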

Our proof relies on two standard functions from probability, the Gamma function
and the Student $t$-distribution. The Gamma function $\Gamma(x)$ is defined by

$$\Gamma(x) = \int_0^{\infty} e^{-t} t^{x-1}\,dt.$$

Note that this integral is well defined if the real part of $x$ is positive. Integrating by
parts yields $\Gamma(x + 1) = x\Gamma(x)$. This implies that if $n$ is a nonnegative integer then
$\Gamma(n + 1) = n!$; thus the Gamma function generalizes the factorial function (see
iTTTl for more on the Gamma function, including another proof of Wallis's formula 
involving the Gamma function). We need the following: 
Claim: $\Gamma(1/2) = \sqrt{\pi}$.



² For a statistical proof involving an experiment and data, see the chapter on Buffon's needle in
[1] (page 133): if you have infinitely many parallel lines $d$ units apart, then the probability that a
"randomly" dropped rod of length $\ell < d$ crosses one of the lines is $2\ell/(\pi d)$. Thus you can calculate $\pi$
by throwing many rods on the grid and counting the number of intersections.
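A simulated version of the rod-throwing experiment might look like the sketch below. It is an illustration under stated assumptions, not the construction in the cited chapter: the function name, the default rod length and spacing, and the fixed seed are all hypothetical choices. The rod's direction is drawn by rejection sampling in the unit disk so that the simulation itself never uses the value of $\pi$.

```python
import random


def buffon_pi_estimate(throws, rod=0.5, spacing=1.0, seed=0):
    """Estimate pi from simulated throws of a rod shorter than the line spacing."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(throws):
        # Distance from the rod's centre to the nearest line: uniform on [0, spacing/2].
        centre = rng.uniform(0.0, spacing / 2.0)
        # Uniformly random direction via rejection sampling in the unit disk.
        while True:
            x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
            r2 = x * x + y * y
            if 0.0 < r2 <= 1.0:
                break
        # |y| / r is |sin| of the angle between the rod and the lines.
        half_reach = (rod / 2.0) * abs(y) / r2 ** 0.5
        if half_reach >= centre:
            hits += 1
    # P(cross) = 2 * rod / (pi * spacing), so pi is about 2 * rod * throws / (spacing * hits).
    return 2.0 * rod * throws / (spacing * hits)


if __name__ == "__main__":
    print(buffon_pi_estimate(200_000))
```

The estimate converges slowly: the statistical error shrinks only like the inverse square root of the number of throws.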

³ A nice application of Wallis's formula is in determining the universal constant in Stirling's
formula for $n!$; see [15] for some history and applications.
Proof. In the integral for $\Gamma(1/2)$, change variables by setting $u = \sqrt{t}$ (so $dt =
2u\,du = 2\sqrt{t}\,du$). This yields

$$\Gamma(1/2) = 2\int_0^{\infty} e^{-u^2}\,du = \int_{-\infty}^{\infty} e^{-u^2}\,du.$$

This integral is well-known to equal $\sqrt{\pi}$ (see page 542 of 0). The standard proof
is to square the integral and convert to polar coordinates:

$$\Gamma(1/2)^2 = \int_{-\infty}^{\infty} e^{-u^2}\,du \int_{-\infty}^{\infty} e^{-v^2}\,dv = \int_0^{2\pi}\!\int_0^{\infty} e^{-r^2}\,r\,dr\,d\theta = \pi.$$

□
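The claim and the factorial property are easy to sanity-check with Python's standard `math.gamma`. The sketch below also estimates $\int_{-\infty}^{\infty} e^{-u^2}\,du$ with a midpoint rule; the truncation to $[-10, 10]$ and the helper name `gaussian_integral` are assumptions of this example.

```python
import math


def gaussian_integral(half_width=10.0, steps=100_000):
    """Midpoint-rule estimate of the integral of exp(-u^2) over [-half_width, half_width]."""
    width = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        u = -half_width + (i + 0.5) * width
        total += math.exp(-u * u)
    return total * width


if __name__ == "__main__":
    # Gamma(n + 1) = n! for small nonnegative integers n.
    for n in range(6):
        print(n, math.gamma(n + 1), math.factorial(n))
    # Gamma(1/2), sqrt(pi), and the numerically integrated Gaussian should agree.
    print(math.gamma(0.5), math.sqrt(math.pi), gaussian_integral())
```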

In fact, our proof above shows (substitute $u = t/\sqrt{2}$) that

$$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^2/2}\,dt = 1. \tag{2}$$

The density $e^{-t^2/2}/\sqrt{2\pi}$ is called the standard normal (or Gaussian). This is one
of the most important probability distributions, and we shall see it again when we
look at the Student $t$-distribution. If $g$ is a continuous probability density, then
we say that the random variable $Y$ has distribution $g$ if for any interval $[a, b]$ the
probability that $Y$ takes on a value in $[a, b]$ is $\int_a^b g(y)\,dy$. The celebrated Central
Limit Theorem (see [6, p. 515] for a proof) states that for many continuous densities
$g$, if $X_1, \ldots, X_n$ are independent random variables, each with density $g$, then as
$n \to \infty$ the distribution of $(Y_n - \mu)/(\sigma/\sqrt{n})$ converges to the standard normal
(where $Y_n = (X_1 + \cdots + X_n)/n$ is the sample average, $\mu$ is the mean of $g$, and $\sigma$
is its standard deviation⁴).

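The second function we need is the Student $t$-distribution⁵ (with $\nu$ degrees of
freedom):

$$f_\nu(t) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}} = c_\nu \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}};$$

here $\nu$ is a positive integer and $t$ is any real number.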

Claim: The Student t-distribution is a continuous probability density. 

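Proof. As $f_\nu(t)$ is clearly continuous and nonnegative, to show $f_\nu(t)$ is a
probability density it suffices to show that it integrates to 1. We must therefore show
that

$$\int_{-\infty}^{\infty} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}} dt = \frac{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)}{\Gamma\left(\frac{\nu+1}{2}\right)}.$$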


⁴ The mean $\mu$ of a distribution is its average value: $\mu = \int x\,g(x)\,dx$. The standard deviation $\sigma$
measures how spread out a distribution is about its average value: $\sigma^2 = \int (x - \mu)^2 g(x)\,dx$.
⁵ Student was the pen name of William Gosset.
As the integrand is symmetric, we may integrate from 0 to infinity and double the
result. Letting $t = \sqrt{\nu}\tan\theta$ (so $dt = \sqrt{\nu}\sec^2\theta\,d\theta$) we find

$$\int_{-\infty}^{\infty} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}} dt = 2\sqrt{\nu} \int_0^{\pi/2} \cos^{\nu-1}\theta\,d\theta.$$

The proof follows immediately from two properties of the Beta function (see [2, p.
560]):

$$B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p + q)},$$

$$B(m + 1, n + 1) = 2\int_0^{\pi/2} \cos^{2m+1}(\theta)\,\sin^{2n+1}(\theta)\,d\theta; \tag{3}$$

an elementary proof without appealing to properties of the Beta function is given
in Appendix A. □
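The claim can be checked numerically for a few values of $\nu$. The sketch below uses the same substitution $t = \sqrt{\nu}\tan\theta$ as the proof, so the infinite range becomes $(-\pi/2, \pi/2)$; the midpoint rule, the step count, and the function names are assumptions of this example, and `math.lgamma` is used only to avoid overflow in the Gamma factors.

```python
import math


def student_t_density(t, nu):
    """f_nu(t) = c_nu * (1 + t^2/nu)^(-(nu+1)/2), evaluated through logarithms."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(math.pi * nu))
    return math.exp(log_c - 0.5 * (nu + 1) * math.log1p(t * t / nu))


def total_mass(nu, steps=20_000):
    """Integrate f_nu over the real line via t = sqrt(nu) * tan(theta)."""
    width = math.pi / steps
    root_nu = math.sqrt(nu)
    total = 0.0
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * width
        t = root_nu * math.tan(theta)
        # dt = sqrt(nu) * sec^2(theta) d(theta)
        total += student_t_density(t, nu) * root_nu / math.cos(theta) ** 2
    return total * width


if __name__ == "__main__":
    for nu in (1, 2, 5, 30):
        print(nu, total_mass(nu))
```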

The Student t-distribution arises in statistical analyses where the sample size 
v is small and each observation is normally distributed with the same mean and 
the same (unknown) variance (see 00 CGI). The reason the Student t-distribution 
is used only for small sample sizes is that as $\nu \to \infty$, $f_\nu(t)$ converges to the
standard normal; proving this will yield Wallis's formula. While we can prove this
by invoking the Central Limit Theorem, we may also see this directly by recalling
that

$$e^x = \lim_{N \to \infty} \left(1 + \frac{x}{N}\right)^N.$$

We therefore have

$$\lim_{\nu \to \infty} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}} = e^{-t^2/2}.$$



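As $f_\nu(t)$ is a probability distribution for all positive integers $\nu$, it integrates to 1 for
all such $\nu$, which is equivalent to

$$\frac{1}{c_\nu} = \int_{-\infty}^{\infty} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}} dt.$$

Taking the limit as $\nu \to \infty$ yields

$$\lim_{\nu \to \infty} \frac{1}{c_\nu} = \lim_{\nu \to \infty} \int_{-\infty}^{\infty} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}} dt = \int_{-\infty}^{\infty} \lim_{\nu \to \infty} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}} dt = \int_{-\infty}^{\infty} e^{-t^2/2}\,dt = \sqrt{2\pi}.$$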

Some work is necessary of course to justify interchanging the integral and the limit;
this justification is why our argument is only "mostly elementary". A standard
proof uses the Dominated Convergence Theorem (see [7, p. 54] or [9, p. 238]) to
show that as $\nu \to \infty$ the $t$-distribution converges to the standard normal;⁶ since
$(1 + x/n)^n$ is increasing in $n$ for $x > 0$, one may take $1/(1 + t^2/2)$ as the
dominating function for all $\nu \ge 2$ (a Gaussian cannot serve, as each $t$-density has
only polynomially decaying tails). We have therefore shown
that

$$c = \lim_{\nu \to \infty} c_\nu = \lim_{\nu \to \infty} \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)} = \frac{1}{\sqrt{2\pi}}. \tag{4}$$

The fact that $c = 1/\sqrt{2\pi}$ is the key step in our proof of Wallis's formula. We have
calculated the limit by using our observation that a probability distribution must
integrate to 1; calculating it by brute force analysis of the Gamma factors yields
our main result.
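The limit in equation (4), and the pointwise limit of the integrand behind it, can be observed numerically. This is a sketch under the assumption that `math.lgamma` is accurate enough for the difference of two large log-Gamma values; the names `c` and `kernel` are hypothetical helpers for this example.

```python
import math


def c(nu):
    """Normalization constant c_nu = Gamma((nu+1)/2) / (sqrt(pi*nu) * Gamma(nu/2))."""
    log_ratio = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
    return math.exp(log_ratio) / math.sqrt(math.pi * nu)


def kernel(t, nu):
    """Unnormalized Student t integrand (1 + t^2/nu)^(-(nu+1)/2)."""
    return (1.0 + t * t / nu) ** (-(nu + 1) / 2)


target = 1.0 / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    for nu in (1, 10, 100, 10_000, 1_000_000):
        # The gap to 1/sqrt(2*pi) shrinks roughly like 1/nu.
        print(nu, c(nu), target - c(nu))
    print(kernel(1.5, 1e6), math.exp(-1.5 ** 2 / 2))
```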



Theorem: Wallis's formula is true. 

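Proof. The proof follows from expanding the Gamma functions and substituting
into (4); we highlight the main steps. Let $\nu = 2m$. Using $\Gamma(n + 1) = n\Gamma(n)$ and
$\Gamma(1/2) = \sqrt{\pi}$ we find that

$$\Gamma\left(\frac{2m+1}{2}\right) = (2m-1)(2m-3) \cdots 5 \cdot 3 \cdot 1 \cdot \frac{\sqrt{\pi}}{2^m}.$$

As $\Gamma(m) = (m - 1)!$, after some algebra we find that

$$c_{2m} = \frac{1 \cdot 3 \cdot 5 \cdots (2m-3) \cdot (2m-1)}{2 \cdot 4 \cdot 6 \cdots (2m-2) \cdot 2m} \cdot \frac{\sqrt{m}}{\sqrt{2}}.$$

Multiplying by $1 \cdot (2m+1)/(2m+1)$ and regrouping, we find

$$c_{2m} = \left[\frac{1 \cdot 3}{2 \cdot 2} \cdot \frac{3 \cdot 5}{4 \cdot 4} \cdots \frac{(2m-1)(2m+1)}{2m \cdot 2m} \cdot \frac{1}{2m+1}\right]^{1/2} \frac{\sqrt{m}}{\sqrt{2}},$$

which we rewrite as

$$\prod_{n=1}^{m} \frac{2n \cdot 2n}{(2n-1)(2n+1)} = \frac{2 \cdot 2}{1 \cdot 3} \cdot \frac{4 \cdot 4}{3 \cdot 5} \cdots \frac{2m \cdot 2m}{(2m-1)(2m+1)} = \frac{m}{(4m+2)\,c_{2m}^2}.$$

As $\lim_{m \to \infty} c_{2m}^{-2} = 2\pi$ and $\lim_{m \to \infty} m/(4m+2) = 1/4$, we have

$$\prod_{n=1}^{\infty} \frac{2n \cdot 2n}{(2n-1)(2n+1)} = \frac{\pi}{2},$$

which completes the proof. □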
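The identity between the partial Wallis products and the constants $c_{2m}$ used in the proof can be verified numerically. In this sketch the function names are hypothetical and `math.lgamma` is assumed accurate enough for the Gamma ratio; the first function multiplies out the product directly, the second evaluates $m/((4m+2)\,c_{2m}^2)$.

```python
import math


def wallis_partial_product(m):
    """Product over n = 1..m of (2n * 2n) / ((2n - 1) * (2n + 1))."""
    product = 1.0
    for n in range(1, m + 1):
        product *= (2 * n) * (2 * n) / ((2 * n - 1) * (2 * n + 1))
    return product


def c(nu):
    """c_nu = Gamma((nu+1)/2) / (sqrt(pi*nu) * Gamma(nu/2))."""
    log_ratio = math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
    return math.exp(log_ratio) / math.sqrt(math.pi * nu)


def wallis_from_c(m):
    """The same partial product, written as m / ((4m + 2) * c_{2m}^2)."""
    return m / ((4 * m + 2) * c(2 * m) ** 2)


if __name__ == "__main__":
    for m in (1, 5, 50, 100_000):
        print(m, wallis_partial_product(m), wallis_from_c(m), math.pi / 2)
```

The partial products approach $\pi/2$ from below with an error of order $1/m$, so the product is of more theoretical than computational interest.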

By combining the expansion for $\pi$ from Wallis's formula with those involving
the Bernoulli numbers and the zeta function, we obtain a proof of the following
amusing formula for $\log(\pi/2)$.



⁶ For completeness we sketch how such an argument could proceed. If $t \in [-\log^2\nu, \log^2\nu]$ then
$|(1 + t^2/\nu)^{-(\nu+1)/2} - \exp(-t^2/2)|$ tends to zero rapidly with $\nu$. Further, if $f(t)$ is the density of
the standard normal, then $\int_{|t| > \log^2\nu} f(t)\,dt$ and $\int_{|t| > \log^2\nu} f_\nu(t)\,dt$ also tend to zero rapidly with $\nu$.
Careful bookkeeping shows that the normalization constants $c_\nu$ must therefore approach $c = 1/\sqrt{2\pi}$.
Theorem: We have

$$\log\frac{\pi}{2} = \sum_{k=1}^{\infty} \frac{\zeta(2k)}{k\,4^k} = \sum_{k=1}^{\infty} \frac{(-1)^{k+1} B_{2k}\,\pi^{2k}}{2k\,(2k)!}. \tag{5}$$



Proof. The Taylor series of $\log(1 - x)$ is $-\sum_{k=1}^{\infty} x^k/k$. The $n$th factor in
Wallis's formula may be written as $(1 - 1/(4n^2))^{-1}$. Thus taking logarithms of
Wallis's formula and Taylor expanding yields

$$\log\frac{\pi}{2} = -\sum_{n=1}^{\infty} \log\left(1 - \frac{1}{4n^2}\right) = \sum_{n=1}^{\infty} \sum_{k=1}^{\infty} \frac{1}{k\,(4n^2)^k} = \sum_{k=1}^{\infty} \frac{1}{k\,4^k} \sum_{n=1}^{\infty} \frac{1}{n^{2k}}.$$

The $n$-sum gives $\zeta(2k)$, and the claim now follows. □

Note that the above formula for $\log(\pi/2)$ converges well. It is easy to see that
$|\zeta(2k)| < 2$ (and $\lim_{k \to \infty} \zeta(2k) = 1$). Thus each additional summand yields at
least one new digit (base 4). See [16] for additional formulas for $\log(\pi/2)$.
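The convergence rate can be seen by summing the series $\sum_k \zeta(2k)/(k\,4^k)$ directly. In the sketch below, $\zeta(2k)$ is approximated by a partial sum plus the first two Euler-Maclaurin tail terms; that tail correction, the cutoff of 1000 terms, and the function names are assumptions made for this example rather than anything taken from the text.

```python
import math


def zeta(s, terms=1000):
    """Approximate zeta(s) for s > 1: partial sum plus a simple tail estimate."""
    head = sum(n ** -s for n in range(1, terms + 1))
    # Tail estimate: integral from `terms` to infinity, minus half the last term.
    tail = terms ** (1 - s) / (s - 1) - 0.5 * terms ** -s
    return head + tail


def log_half_pi(num_terms):
    """Sum of zeta(2k) / (k * 4^k) for k = 1..num_terms."""
    return sum(zeta(2 * k) / (k * 4 ** k) for k in range(1, num_terms + 1))


if __name__ == "__main__":
    exact = math.log(math.pi / 2)
    for num_terms in (1, 2, 5, 10, 25):
        approx = log_half_pi(num_terms)
        # The error should fall below 4**(-num_terms), one base-4 digit per term.
        print(num_terms, approx, abs(approx - exact))
```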
Appendix A. Elementary calculation of constants

For completeness, we provide a more elementary derivation that the stated
constant is the correct normalization constant for the Student $t$-distribution.



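Lemma A.1. We have

$$2\sqrt{\nu} \int_0^{\pi/2} \cos^{\nu-1}\theta\,d\theta = \frac{\sqrt{\pi\nu}\,\Gamma\left(\frac{\nu}{2}\right)}{\Gamma\left(\frac{\nu+1}{2}\right)}. \tag{6}$$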



Proof. The claim follows by induction; we sketch the main idea. Assume we have
proven the claim for all $\nu \le n$. Then

$$\int_0^{\pi/2} \cos^{n+1}\theta\,d\theta = \int_0^{\pi/2} (1 - \sin^2\theta)\cos^{n-1}\theta\,d\theta = \int_0^{\pi/2} \cos^{n-1}\theta\,d\theta - \int_0^{\pi/2} \sin\theta \cdot (\cos^{n-1}\theta\,\sin\theta)\,d\theta.$$

We integrate the second term on the right by parts, with $u = \sin\theta$ and $dv =
\cos^{n-1}\theta\,\sin\theta\,d\theta$. The $uv$ term vanishes at 0 and $\pi/2$ and we are left with

$$\int_0^{\pi/2} \cos^{n+1}\theta\,d\theta = \int_0^{\pi/2} \cos^{n-1}\theta\,d\theta - \frac{1}{n} \int_0^{\pi/2} \cos^{n}\theta\,\cos\theta\,d\theta,$$

which simplifies to

$$\int_0^{\pi/2} \cos^{n+1}\theta\,d\theta = \frac{n}{n+1} \int_0^{\pi/2} \cos^{n-1}\theta\,d\theta.$$

The claim now follows from standard properties of the Gamma function
($\Gamma(m + 1) = m\Gamma(m)$ and $\Gamma(1/2) = \sqrt{\pi}$).

□
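The recursion from the proof, together with the base cases $\int_0^{\pi/2} d\theta = \pi/2$ and $\int_0^{\pi/2} \cos\theta\,d\theta = 1$, determines every integral in the lemma. The sketch below (function names are hypothetical) evaluates both sides of equation (6) for a range of $\nu$ so they can be compared.

```python
import math


def cos_power_integral(power):
    """Integral of cos(theta)**power over [0, pi/2] using I_p = (p-1)/p * I_{p-2}."""
    if power == 0:
        return math.pi / 2
    if power == 1:
        return 1.0
    return (power - 1) / power * cos_power_integral(power - 2)


def lemma_left_side(nu):
    """2 * sqrt(nu) * integral of cos^(nu-1) over [0, pi/2]."""
    return 2.0 * math.sqrt(nu) * cos_power_integral(nu - 1)


def lemma_right_side(nu):
    """sqrt(pi * nu) * Gamma(nu/2) / Gamma((nu+1)/2)."""
    return math.sqrt(math.pi * nu) * math.gamma(nu / 2) / math.gamma((nu + 1) / 2)


if __name__ == "__main__":
    for nu in (1, 2, 3, 10, 25):
        print(nu, lemma_left_side(nu), lemma_right_side(nu))
```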